Abstract — The performance of a modulation classifier is highly
sensitive to channel signal-to-noise ratio (SNR). In this paper, we
focus on amplitude-phase modulations and propose a modulation
classification framework based on centralized data fusion
using multiple radios and the hybrid maximum likelihood (ML)
approach. In order to alleviate the computational complexity
associated with ML estimation, we adopt the Expectation-Maximization
(EM) algorithm. Due to SNR diversity, the proposed
multi-radio framework provides robustness to channel SNR.
Numerical results show the superiority of the proposed approach
with respect to single-radio approaches as well as to modulation
classifiers using moments-based estimators.

Index Terms — Modulation classification, data fusion, ML
estimation, EM algorithm

I. Introduction 

Modulation classification (MC) is a statistical signal processing
problem that deals with determining the modulation
type of a noisy communication signal. It plays an important
role in many civilian and military applications, e.g., adaptive
cognitive radios for satellite communications [1]. It is well
known that the optimal classifier (in the Bayesian sense) is
the likelihood based (LB) classifier. Different forms of LB
classifiers for the MC problem have been proposed in the
literature [2]. These include generalized likelihood ratio test
(GLRT), average likelihood ratio test (ALRT) and hybrid
likelihood ratio test (HLRT) based classifiers. A thorough
review of these techniques can be found in [3]. In this
paper, we focus on amplitude-phase modulations and consider
the HLRT approach, where the likelihood function (LF) is
marginalized over the unknown constellation symbols and then
the resulting average LF is used to find the ML estimates
of the remaining unknown signal parameters. These estimates
are then plugged into the average LFs to perform maximum
likelihood (ML) classification. We call this approach hybrid
maximum likelihood classification.

The performance of an MC system using a single radio
depends highly on the channel quality, i.e., fading and
background noise. In addition, some signal parameters, such as
signal-to-noise ratio (SNR) and/or phase offset, are usually
unknown, which further complicates the classification
problem. Note that, for an MC problem, the radio receiver acts
as a sensor; therefore, we use the terms radio and sensor
interchangeably throughout the paper. Receiver diversity is a
common technique used in wireless communication systems
to alleviate channel fading effects for demodulation/symbol
detection. Similarly, it is natural to argue that using multiple
radios for modulation classification, i.e., collaborative MC,
has the potential to improve classification performance
compared to a single radio, especially in the low to mid
SNR regimes. Inspired by this reasoning,
collaborative MC approaches have been proposed in [4], [5],
[6], [7], [8]. Most of these works are based on the distributed
detection framework [9], where each radio makes a local (hard
or soft) classification decision and then these decisions are
fused at a fusion center (FC) to make a global decision.
The only centralized approach proposed in the literature is
in [4], where an antenna array is used to receive the unknown
signal. The authors use moments-based estimators to estimate
the unknown signal parameters to simplify the estimation
problem. As a result, the estimates in [4] are obtained by
ignoring the coupling (due to common received constellation
symbols) between different antenna elements, which results in
sub-optimality.

In this paper, we propose a centralized fusion approach
where raw data (instead of decisions) from local radios
are fused at a fusion center to make the global
classification decision. Although the proposed centralized data
fusion approach is expected to improve the performance, the
resulting MC problem is computationally much harder to solve
than a single-radio MC problem. In order to alleviate this issue,
we propose to use the well-known Expectation-Maximization
(EM) algorithm [10], which significantly simplifies the MC
problem and has favorable convergence properties. In an
earlier work [11], the EM algorithm was used for the MC
problem using a single radio under flat fading channels
corrupted by Gaussian mixture noise. Our proposed framework,
along with the problem formulation for centralized fusion
based MC, is different from the problem considered in [11]
even though the EM algorithm is suitable for both. Due to
SNR diversity, the proposed centralized data fusion framework
significantly improves the MC performance compared to
single-radio approaches. Furthermore, our numerical
results show that the proposed EM based solution provides
superior performance compared to the moments-based solution
proposed in [4] with only a small increase in computational
complexity.

II. Problem Formulation 

Consider a radio/sensor network with $L$ sensors observing
the same communication signal with a block of $N$ constellation
(information) symbols that undergo flat fading. These
sensors are located far enough from each other such that they
experience independent fading. We assume that timing and
frequency offsets have been perfectly estimated. The received
baseband observation sequence at sensor $l$ can be written as:



$$r_{l,n} = a_l e^{j\theta_l} I_n + w_{l,n}, \tag{1}$$



where $l = 1, \ldots, L$, $n = 0, \ldots, N-1$, $I_n$ is the $n$-th complex
constellation symbol of the block, $w_{l,n}$ is the additive complex
zero-mean white Gaussian noise with variance $N_0$, and $a_l$
and $\theta_l$ are the channel gain and the channel phase at sensor
$l$, respectively. The above signal model is a commonly used
model in the MC literature. In this model,
$\{a_l\}_{l=1}^{L}$, $\{\theta_l\}_{l=1}^{L}$ and $\{I_n\}_{n=0}^{N-1}$ are the unknown signal
parameters. In a general modulation classification scenario, in addition
to the unknown signal parameters, the noise power $N_0$ may also
be unknown. In this case, the unknown parameter vector can
be expressed as $\mathbf{u} := [\mathbf{a}, \boldsymbol{\theta}, \mathbf{I}, N_0]$, where $\mathbf{a} := [a_1, \ldots, a_L]^T$,¹
$\boldsymbol{\theta} := [\theta_1, \ldots, \theta_L]^T$ and $\mathbf{I} := [I_0, \ldots, I_{N-1}]^T$. We assume
that noise is independent across sensors. Suppose there are $S$
candidate modulation formats under consideration and let $I_n^{(i)}$
denote the constellation symbol at time $n$ corresponding to
modulation $i \in \{1, \ldots, S\}$. Let $\mathbf{r}$ denote the observation vector
defined as $\mathbf{r} := [\mathbf{r}_1^T, \ldots, \mathbf{r}_L^T]^T$, where $\mathbf{r}_l := [r_{l,0}, \ldots, r_{l,N-1}]^T$,
and let $H_i$ represent the hypothesis associated with modulation
format $i$. Let $p_i(\mathbf{r}|\mathbf{u}) := p(\mathbf{r}|H_i, \mathbf{u})$ denote² the conditional
probability density function (pdf) of $\mathbf{r}$ conditioned on the
unknown modulation format $i$ and the unknown parameter
vector $\mathbf{u}$, i.e., the likelihood function (LF), which is given
by



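$$p_i(\mathbf{r}|\mathbf{u}) = \frac{1}{(\pi N_0)^{LN}} \exp\left\{ -\frac{1}{N_0} \sum_{l=1}^{L} \sum_{n=0}^{N-1} \left| r_{l,n} - a_l e^{j\theta_l} I_n^{(i)} \right|^2 \right\}. \tag{2}$$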
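As a sanity check on the signal model in (1) and the LF in (2), the following minimal Python sketch draws observations for $L$ sensors and evaluates the logarithm of (2) when the symbols are known. The function names (`psk_constellation`, `simulate_received`, `log_lf_known_symbols`) and the parameter values are illustrative assumptions and do not come from the paper.

```python
import cmath
import math
import random


def psk_constellation(order):
    """Unit-energy M-PSK points exp(j*2*pi*m/order), m = 0..order-1."""
    return [cmath.exp(2j * math.pi * m / order) for m in range(order)]


def simulate_received(symbols, gains, phases, noise_var, rng):
    """Draw r[l][n] = a_l * exp(j*theta_l) * I_n + w_{l,n} as in (1)."""
    sigma = math.sqrt(noise_var / 2.0)  # std per real dimension, total variance N0
    received = []
    for a, theta in zip(gains, phases):
        h = a * cmath.exp(1j * theta)
        received.append([
            h * s + complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
            for s in symbols
        ])
    return received


def log_lf_known_symbols(received, symbols, gains, phases, noise_var):
    """Logarithm of (2): -L*N*ln(pi*N0) - (1/N0) * sum |r - a e^{j theta} I|^2."""
    num_sensors, num_symbols = len(received), len(symbols)
    sq_err = 0.0
    for r_l, a, theta in zip(received, gains, phases):
        h = a * cmath.exp(1j * theta)
        sq_err += sum(abs(r - h * s) ** 2 for r, s in zip(r_l, symbols))
    return -num_sensors * num_symbols * math.log(math.pi * noise_var) - sq_err / noise_var


# Example: two sensors observing the same QPSK block (values chosen arbitrarily).
rng = random.Random(0)
qpsk = psk_constellation(4)
block = [rng.choice(qpsk) for _ in range(200)]
observations = simulate_received(block, [1.0, 0.6], [0.3, -1.1], 0.5, rng)
```

Evaluating `log_lf_known_symbols` at the true gains and phases should give a larger value than at badly mismatched phases, which is the property the ML estimators rely on.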

Note that the LF in (2) is parameterized by the constellation
symbols $I_n^{(i)}$, which represent the modulation format $i$. This is
a composite multiple hypothesis testing problem. In a Bayesian
setting, the optimal classifier in terms of minimum probability
of classification error is the maximum a posteriori (MAP)
classifier. If there is no information available on the a priori
probabilities, which is usually the case in a noncooperative
environment, one can use a non-informative prior, i.e., each
modulation scheme is assigned an identical prior probability.
This is the scenario assumed here, in which case the optimal
classifier takes the form of a maximum likelihood (ML) classifier.
In the hybrid maximum likelihood approach, the LF is
averaged over the unknown constellation symbols $I_n$ and
then maximized over the remaining unknown parameters. From here
on, let $\mathbf{u} := [\mathbf{a}, \boldsymbol{\theta}, N_0]$. Given $I_n$ and modulation format $i$, we have
the following



$$p_i(r_{1,n}, \ldots, r_{L,n} \mid I_n, \mathbf{u}) = \prod_{l=1}^{L} p_i(r_{l,n} \mid I_n, \mathbf{u}). \tag{3}$$

After averaging over $I_n$,

$$p_i(r_{1,n}, \ldots, r_{L,n} \mid \mathbf{u}) = \frac{1}{M_i} \sum_{m=1}^{M_i} \prod_{l=1}^{L} p_i(r_{l,n} \mid I^{m,(i)}, \mathbf{u}), \tag{4}$$

where $M_i$ and $I^{m,(i)}$ are the number of constellation symbols
and the $m$-th constellation symbol in modulation $i$, respectively.
For example, $M_i = 2$ for BPSK since BPSK has only
two possible constellation symbols. Note that, in (4), the
constellation symbols are assumed to have equal a priori
probabilities, i.e., $p(I^{m,(i)} | H_i) = 1/M_i$. This is a common
assumption made in communication theory. Without loss of
generality, we further assume that $E\{|I_n^{(i)}|^2\} = 1$, where
$E\{\cdot\}$ denotes statistical expectation. In other words, the power
of constellation symbols is normalized to unity. Using (4),
$p_i(\mathbf{r}|\mathbf{u})$ becomes

$$p_i(\mathbf{r}|\mathbf{u}) = \frac{1}{M_i^N (\pi N_0)^{LN}} \prod_{n=0}^{N-1} \sum_{m=1}^{M_i} \exp\left( -\frac{1}{N_0} \sum_{l=1}^{L} \left| r_{l,n} - a_l e^{j\theta_l} I^{m,(i)} \right|^2 \right). \tag{5}$$

¹ Superscript $T$ denotes vector/matrix transpose.

² Throughout the paper, we use the notation $p_i(\cdot)$ to denote $p(\cdot | H_i)$.
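Evaluating the product in (5) directly underflows for realistic block lengths, so in practice its logarithm is computed term by term. The sketch below is one way to do this with the log-sum-exp trick; the helper names (`log_sum_exp`, `averaged_log_lf`) are hypothetical and the code is an illustration, not the authors' implementation.

```python
import cmath
import math


def log_sum_exp(values):
    """Numerically stable ln(sum(exp(v) for v in values))."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))


def averaged_log_lf(received, constellation, gains, phases, noise_var):
    """Logarithm of (5): symbols averaged out with equal priors 1/M_i.

    received[l][n] is the sample of sensor l at time n.
    """
    num_sensors, num_symbols = len(received), len(received[0])
    channel = [a * cmath.exp(1j * th) for a, th in zip(gains, phases)]
    # Constant part: -N ln M_i - L N ln(pi N0).
    total = -num_symbols * math.log(len(constellation))
    total -= num_sensors * num_symbols * math.log(math.pi * noise_var)
    for n in range(num_symbols):
        # One exponent per candidate symbol, summed over all sensors.
        exponents = [
            -sum(abs(received[l][n] - channel[l] * point) ** 2
                 for l in range(num_sensors)) / noise_var
            for point in constellation
        ]
        total += log_sum_exp(exponents)
    return total
```

Because every sensor contributes to the same exponent before the sum over candidate symbols is taken, the parameters of different sensors cannot be optimized separately, which is the coupling discussed in the text.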

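Taking the natural logarithm of the LF in (5) and discarding
constant terms, we have the log-likelihood function (LLF)

$$\Lambda_i(\mathbf{u}) = -LN \ln N_0 + \sum_{n=0}^{N-1} \ln\left( \frac{1}{M_i} \sum_{m=1}^{M_i} \exp\left( -\frac{1}{N_0} \sum_{l=1}^{L} \left| r_{l,n} - a_l e^{j\theta_l} I^{m,(i)} \right|^2 \right) \right). \tag{6}$$

In HLRT, the modulation
format that maximizes the resulting LLF is selected as the
final decision, i.e.,

$$\hat{i} = \arg\max_{i \in \{1, \ldots, S\}} \Lambda_i(\hat{\mathbf{u}}_i), \tag{7}$$

where

$$\hat{\mathbf{u}}_i = \arg\max_{\mathbf{u}} \Lambda_i(\mathbf{u}). \tag{8}$$

We can make the following observations about this maximization. The
problem of finding the global maximum of $\Lambda_i(\mathbf{u})$ with respect
to $\mathbf{u}$ is a $(2L+1)$-dimensional non-convex optimization problem,
which is extremely difficult to solve in general. Furthermore,
there is coupling between the unknowns of different
sensors due to the common unknown constellation symbols. In
other words, the problem cannot be decoupled into equivalent
multiple lower dimensional (simpler) optimization problems.
There is no closed-form analytical solution. Therefore, either
numerical methods or approximation techniques need to be
employed. In the following section, we discuss our approach
for solving this problem, which is based on the
Expectation-Maximization (EM) algorithm.

III. The EM Algorithm

Suppose modulation format $i$ is under consideration and the
constellation symbol vector $\mathbf{I}$ is known. In this case, we have
the following closed-form expressions for the ML estimators:

$$\hat{\theta}_l = \tan^{-1}\left( \frac{\Im(\mathbf{I}^H \mathbf{r}_l)}{\Re(\mathbf{I}^H \mathbf{r}_l)} \right), \tag{9}$$

$$\hat{a}_l = \frac{1}{\mathbf{I}^H \mathbf{I}} \Re\left( e^{-j\hat{\theta}_l} \mathbf{I}^H \mathbf{r}_l \right), \tag{10}$$

$$\hat{N}_0 = \frac{1}{LN} \sum_{n=0}^{N-1} \sum_{l=1}^{L} \left| r_{l,n} - \hat{a}_l e^{j\hat{\theta}_l} I_n \right|^2, \tag{11}$$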



where $\Re(\cdot)$ and $\Im(\cdot)$ denote real and imaginary parts of a
complex number, respectively, and $H$ denotes the Hermitian
of a complex vector/matrix. From the above closed-form
expressions, it is clear that when $\mathbf{I}$ is known, the maximization
problem (for estimating $a_l$ and $\theta_l$) is decoupled between
different sensors. Due to the fact that the ML estimation problem
is significantly simpler when the constellation symbols are
known, we adopt the well-known EM algorithm [10] to solve
this problem. The EM algorithm is an iterative method which
enables the computation of ML estimates, and is especially well
suited to problems where ML estimation is intractable because
of the presence of unknown (unobserved) data. In our case,
the constellation symbols represent unobserved data. We can
formally describe the EM algorithm for our problem in (8)
as follows [10]. Let us define the so-called complete data
$\mathbf{x} := [\mathbf{r}^T, \mathbf{I}^T]^T$. Starting from an initial estimate $\mathbf{u}_i^{(0)}$, the EM
algorithm performs the following two steps: the expectation
step (E-step) and the maximization step (M-step).

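E-step:

$$Q(\mathbf{u}_i | \mathbf{u}_i^{(t)}) = E\left\{ \ln p_i(\mathbf{x} | \mathbf{u}_i) \mid \mathbf{r}, \mathbf{u}_i^{(t)} \right\}, \tag{12}$$

M-step:

$$\mathbf{u}_i^{(t+1)} = \arg\max_{\mathbf{u}_i} Q(\mathbf{u}_i | \mathbf{u}_i^{(t)}). \tag{13}$$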



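Given the fact that the unknown parameter vector $\mathbf{u}$ is
independent of the transmitted constellation symbols $\mathbf{I}$, the E-step
in (12) reduces to:

$$Q(\mathbf{u}_i | \mathbf{u}_i^{(t)}) = \sum_{\mathbf{I}} \ln p_i(\mathbf{r} | \mathbf{I}, \mathbf{u}_i) \, P_i(\mathbf{I} | \mathbf{r}, \mathbf{u}_i^{(t)}). \tag{14}$$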



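It is straightforward to calculate $\ln p_i(\mathbf{r} | \mathbf{I}, \mathbf{u}_i)$ using (2). Let
$\mathbf{r}_n := [r_{1,n}, \ldots, r_{L,n}]^T$ and let $\alpha_{n,m}^{(t)} := P_i[I_n = I^{m,(i)} | \mathbf{r}_n, \mathbf{u}_i^{(t)}]$,
$m = 1, \ldots, M_i$, denote the a posteriori probability of the
unknown constellation symbol, which can be calculated as

$$\alpha_{n,m}^{(t)} = P_i\left[ I_n = I^{m,(i)} \mid \mathbf{r}_n, \mathbf{u}_i^{(t)} \right] = \frac{p_i\left( \mathbf{r}_n \mid I_n = I^{m,(i)}, \mathbf{u}_i^{(t)} \right)}{\sum_{k=1}^{M_i} p_i\left( \mathbf{r}_n \mid I_n = I^{k,(i)}, \mathbf{u}_i^{(t)} \right)}. \tag{15}$$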



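In (15), we have used the assumption that each data symbol has
the same a priori probability, i.e., $P_i[I_n = I^{m,(i)} | \mathbf{u}_i^{(t)}] = 1/M_i$,
$m = 1, \ldots, M_i$. Let us also define

$$\nu_n^{(t)} := \sum_{m=1}^{M_i} \alpha_{n,m}^{(t)} I^{m,(i)}, \qquad E^{(t)} := \sum_{n=0}^{N-1} \sum_{m=1}^{M_i} \alpha_{n,m}^{(t)} \left| I^{m,(i)} \right|^2. \tag{16}$$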
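The quantities in (15) and (16) can be computed as in the following minimal sketch. Because the factor $(\pi N_0)^{-L}$ is common to the numerator and denominator of (15), only the exponents are needed; the largest exponent is subtracted before exponentiating to avoid underflow. The function names are hypothetical, chosen only for illustration.

```python
import cmath
import math


def symbol_posteriors(received_n, constellation, gains, phases, noise_var):
    """alpha_{n,m} of (15) for one time index n, with equal priors 1/M_i.

    received_n holds the L samples r_{1,n}, ..., r_{L,n}.
    """
    exponents = []
    for point in constellation:
        dist = sum(abs(r - a * cmath.exp(1j * th) * point) ** 2
                   for r, a, th in zip(received_n, gains, phases))
        exponents.append(-dist / noise_var)
    peak = max(exponents)  # subtracting the peak avoids underflow
    weights = [math.exp(e - peak) for e in exponents]
    total = sum(weights)
    return [w / total for w in weights]


def posterior_moments(alpha, constellation):
    """nu_n and E of (16) from the posteriors alpha[n][m]."""
    nu = [sum(p * s for p, s in zip(row, constellation)) for row in alpha]
    energy = sum(p * abs(s) ** 2
                 for row in alpha for p, s in zip(row, constellation))
    return nu, energy
```

For unit-modulus constellations such as PSK, `energy` is simply the block length $N$, since each row of posteriors sums to one.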

Note that $\nu_n^{(t)}$ and $E^{(t)}$ represent the a posteriori expectation
of the constellation symbol at time $n$ and the total normalized
energy of the transmitted discrete-time signal, respectively.
Substituting (15)-(16) in (14) and carrying out the maximization
in (13) by taking the first derivatives and setting them to
zero, we obtain the following closed-form expressions for the
$(t+1)$-th step in the EM algorithm:

$$\theta_l^{(t+1)} = \tan^{-1}\left( \frac{\Im\left( \sum_{n=0}^{N-1} r_{l,n} \nu_n^{(t)*} \right)}{\Re\left( \sum_{n=0}^{N-1} r_{l,n} \nu_n^{(t)*} \right)} \right), \tag{17}$$

$$a_l^{(t+1)} = \frac{1}{E^{(t)}} \Re\left( e^{-j\theta_l^{(t+1)}} \sum_{n=0}^{N-1} r_{l,n} \nu_n^{(t)*} \right), \tag{18}$$

$$N_0^{(t+1)} = \frac{1}{LN} \sum_{n=0}^{N-1} \sum_{m=1}^{M_i} \sum_{l=1}^{L} \alpha_{n,m}^{(t)} \left| r_{l,n} - a_l^{(t+1)} e^{j\theta_l^{(t+1)}} I^{m,(i)} \right|^2, \tag{19}$$

where $(\cdot)^*$ denotes complex conjugation and
$\mathbf{u}_i^{(t)} := [\mathbf{a}^{(t)}, \boldsymbol{\theta}^{(t)}, N_0^{(t)}]$ collects the estimates at iteration $t$.
One important property of
the EM algorithm is that the LF monotonically increases at every
iteration and converges to a stationary point [14]. However,
this stationary point can be a local maximum; therefore, either
a good initialization or multiple initializations are needed to
guarantee convergence to a good stationary point.
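The E-step and M-step described in this section, followed by the HLRT rule in (7), can be sketched in a few dozen lines of Python. This is an illustration under stated assumptions rather than the authors' code: the function names, the stopping tolerance, the iteration cap, the small floor on the noise estimate, and the `init` callable that supplies starting values are all hypothetical. The phase update uses `cmath.phase`, i.e. the four-quadrant form of the arctangent, so the gain update reduces to a magnitude.

```python
import cmath
import math


def e_step(received, constellation, gains, phases, noise_var):
    """Posteriors alpha[n][m] and the LLF (up to a constant) at the current estimate."""
    L, N = len(received), len(received[0])
    channel = [a * cmath.exp(1j * th) for a, th in zip(gains, phases)]
    alpha, llf = [], -L * N * math.log(noise_var)
    for n in range(N):
        expo = [-sum(abs(received[l][n] - channel[l] * s) ** 2
                     for l in range(L)) / noise_var for s in constellation]
        peak = max(expo)
        weights = [math.exp(e - peak) for e in expo]
        norm = sum(weights)
        alpha.append([w / norm for w in weights])
        llf += peak + math.log(norm / len(constellation))
    return alpha, llf


def m_step(received, constellation, alpha):
    """Closed-form updates: phase, then gain, then noise power."""
    L, N = len(received), len(received[0])
    nu = [sum(p * s for p, s in zip(row, constellation)) for row in alpha]
    energy = sum(p * abs(s) ** 2 for row in alpha for p, s in zip(row, constellation))
    gains, phases = [], []
    for l in range(L):
        corr = sum(received[l][n] * nu[n].conjugate() for n in range(N))
        phases.append(cmath.phase(corr))  # four-quadrant arctangent
        gains.append(abs(corr) / energy)
    resid = 0.0
    for l in range(L):
        h = gains[l] * cmath.exp(1j * phases[l])
        for n in range(N):
            resid += sum(p * abs(received[l][n] - h * s) ** 2
                         for p, s in zip(alpha[n], constellation))
    # Floor (an assumption) keeps ln(N0) finite on noise-free data.
    return gains, phases, max(resid / (L * N), 1e-12)


def em_estimate(received, constellation, gains, phases, noise_var,
                iters=50, tol=1e-9):
    """Alternate E- and M-steps until the LLF stops improving."""
    prev = -math.inf
    for _ in range(iters):
        alpha, llf = e_step(received, constellation, gains, phases, noise_var)
        if llf - prev < tol:
            break
        prev = llf
        gains, phases, noise_var = m_step(received, constellation, alpha)
    _, llf = e_step(received, constellation, gains, phases, noise_var)
    return gains, phases, noise_var, llf


def classify(received, candidates, init):
    """Pick the candidate modulation with the largest maximized LLF."""
    scores = {}
    for name, constellation in candidates.items():
        gains, phases, noise_var = init(received, constellation)
        scores[name] = em_estimate(received, constellation,
                                   gains, phases, noise_var)[3]
    return max(scores, key=scores.get), scores
```

Here `candidates` maps a label to a list of constellation points, and `init(received, constellation)` returns starting values `(gains, phases, noise_var)`. Since EM only guarantees a stationary point, the quality of `init` directly affects the decision.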

IV. Method of Moments Estimators and EM 
Initialization 

In the literature, there have been attempts to develop simple
estimators for $\mathbf{u}$ due to the complexity associated with the ML
estimator in the MC problem. These estimators are based on
the method of moments (MoM). The MoM estimators
for $a_l$ and $N_0$ are given, respectively, as [4]:

$$\hat{a}_{l,(i)} = \left( \frac{2 M_{2,l}^2 - M_{4,l}}{2 - E\{|I^{(i)}|^4\}} \right)^{1/4}, \tag{20}$$

$$\hat{N}_{0,(i)} = \sum_{l=1}^{L} \frac{\hat{a}_{l,(i)}}{\sum_{k=1}^{L} \hat{a}_{k,(i)}} \hat{N}_{0,(i)}^{l}, \tag{21}$$

where $\hat{N}_{0,(i)}^{l} = M_{2,l} - \hat{a}_{l,(i)}^2$, and $M_{2,l} = E\{|r_{l,n}|^2\}$ and
$M_{4,l} = E\{|r_{l,n}|^4\}$ represent the second and fourth absolute
moments of $r_{l,n}$, respectively. Note that we use a weighted
average to compute $\hat{N}_{0,(i)}$ by using the noise power estimate
at each sensor, $\hat{N}_{0,(i)}^{l}$, weighted by its corresponding channel
amplitude $\hat{a}_{l,(i)}$. The estimates of the second and fourth
absolute moments are given as $\hat{M}_{2,l} = \frac{1}{N} \sum_{n=0}^{N-1} |r_{l,n}|^2$
and $\hat{M}_{4,l} = \frac{1}{N} \sum_{n=0}^{N-1} |r_{l,n}|^4$. Regarding the channel phases,
the MoM estimators depend on the modulation format under
consideration. If modulation $i$ is M-PSK, the MoM estimate is
$\hat{\theta}_{l,(i)} = \frac{1}{M} \arg\left[ \sum_{n=0}^{N-1} r_{l,n}^{M} \right]$. Otherwise, if modulation $i$
is M-QAM, $\hat{\theta}_{l,(i)} = \frac{1}{4} \arg\left( \sum_{n=0}^{N-1} r_{l,n}^{4} \right)$. The MoM estimates
have been used in the literature [4] to replace the ML
estimates in (7). These modulation classifiers have been named
Quasi HLRT or QHLRT [13]. Here, we propose
to use the MoM estimators as initial points ($\mathbf{u}_i^{(0)}$) for the
EM algorithm explained in Section III. The EM based ML
classification is expected to improve the performance over
QHLRT. However, it clearly requires increased computational
complexity. Nevertheless, it is still computationally much
simpler to implement than numerically searching for the global
ML estimates.
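For the PSK case, the moment-based initial estimates described in this section can be sketched as follows. The function name `mom_initial_estimates` is hypothetical, and the clipping of negative values is an assumption added here because sample moments can yield a negative argument under the fourth root; the source does not say how such cases are handled.

```python
import cmath
import math


def mom_initial_estimates(received, order):
    """Moment-based gains, phases and noise power for unit-energy M-PSK.

    received[l][n] is the sample of sensor l at time n; order is M.
    """
    gains, local_noise, phases = [], [], []
    for r_l in received:
        count = len(r_l)
        m2 = sum(abs(r) ** 2 for r in r_l) / count  # second absolute moment
        m4 = sum(abs(r) ** 4 for r in r_l) / count  # fourth absolute moment
        fourth = 1.0  # E{|I|^4} for any unit-modulus PSK constellation
        a4 = (2.0 * m2 ** 2 - m4) / (2.0 - fourth)
        a = max(a4, 0.0) ** 0.25  # clip (assumption): a4 can be negative
        gains.append(a)
        local_noise.append(max(m2 - a ** 2, 0.0))
        # M-th power removes the PSK modulation; phase is known up to 2*pi/M.
        phases.append(cmath.phase(sum(r ** order for r in r_l)) / order)
    weight = sum(gains)
    if weight > 0.0:
        # Amplitude-weighted average of the per-sensor noise estimates.
        noise_var = sum(a * n0 for a, n0 in zip(gains, local_noise)) / weight
    else:
        noise_var = sum(local_noise) / len(local_noise)
    return gains, phases, noise_var
```

The phase estimate assumes a constellation whose points satisfy $I^M = 1$; a constellation rotated by a fixed offset shifts the estimate by that offset.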
V. Numerical Results

In this section, numerical results are provided to show
the effectiveness of the proposed EM based solution for the MC
problem. We consider a 3-ary MC scenario where the modulations
under consideration are QPSK, 8-PSK and 16-PSK. The
channels are modeled as Rayleigh fading channels, i.e., $a_l$ is a
Rayleigh distributed random variable with scale parameter $\sigma$,
where the channel SNR is $E\{a_l^2\}/N_0 = 2\sigma^2/N_0$ with fixed
$N_0 = 1$. Fig. 1 shows the probability of correct classification
($P_c$) versus channel SNR for different numbers of radios. The
number of samples is fixed at $N = 500$. For comparison,
we include the results obtained by MoM estimators as in
[4]. Low to mid channel SNR regimes are considered as
this is where the multi-radio approach is expected to provide
significant performance improvement. First, it is clear from
Fig. 1 that a centralized data fusion based multi-radio approach
is the key to improving performance at low to mid SNR
regimes. For example, at $P_c = 0.8$, an SNR gain of 15 dB
is attained with 10 radios compared to a single radio. Second,
the proposed EM based centralized ML estimation provides
superior performance compared to MoM based estimation. In
fact, it is surprising to see that the performance of classifiers
using MoM based estimation degrades as the number of radios
increases, to the point where they are no better than simple
guessing. This is due to the fact that MoM estimators do
not always provide meaningful estimates and they do not
necessarily maximize the LF. Moreover, MoM estimators do
not take into account coupling between estimates of different
radio signals due to common constellation symbols. These
factors make MoM based modulation classifiers highly
suboptimal. In Fig. 2, the channel SNR is held fixed and the
performance is depicted with respect to sample size ($N$). The
results are similar to those in Fig. 1. As $N$ increases, $P_c$ also
increases. However, it is clear from Fig. 2 that it is better to
increase the number of radios than to increase the number of
samples. This is due to the SNR diversity gained by employing
multiple radios and the fact that each radio experiences flat
fading. The trade-off here is the cost of the radios as well as
the synchronization overhead that is needed between radios.
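The channel model used in this section fixes the Rayleigh scale parameter through the relation $E\{a_l^2\}/N_0 = 2\sigma^2/N_0$. The sketch below shows one way to draw channel realizations for a target SNR in dB. The function names are hypothetical, and drawing the channel phases uniformly on $[-\pi, \pi]$ is an assumption made here; the text does not state the phase distribution.

```python
import math
import random


def rayleigh_scale_for_snr(snr_db, noise_var=1.0):
    """Scale sigma such that E{a^2}/N0 = 2*sigma^2/N0 equals the target SNR."""
    return math.sqrt(10.0 ** (snr_db / 10.0) * noise_var / 2.0)


def draw_channels(num_radios, snr_db, rng, noise_var=1.0):
    """Independent Rayleigh gains and uniform phases, one pair per radio."""
    sigma = rayleigh_scale_for_snr(snr_db, noise_var)
    # Inverse-CDF sampling of a Rayleigh variable with scale sigma.
    gains = [sigma * math.sqrt(-2.0 * math.log(1.0 - rng.random()))
             for _ in range(num_radios)]
    # Uniform phases are an assumption, not taken from the paper.
    phases = [rng.uniform(-math.pi, math.pi) for _ in range(num_radios)]
    return gains, phases


# Example: ten radios at a 0 dB average channel SNR with N0 = 1.
example_gains, example_phases = draw_channels(10, 0.0, random.Random(0))
```

Because each radio draws its gain independently, the chance that all radios fade deeply at once drops quickly as radios are added, which is the SNR diversity effect discussed above.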

VI. Conclusions 

We proposed a centralized data fusion based modulation 
classification framework using multiple radios and the hybrid 
ML approach. Our proposed solution is based on the EM 
algorithm which significantly simplifies the complicated ML 
estimation problem. Numerical results show that the proposed 
approach is superior to single radio approaches as well as 
classifiers using moments based estimators. 
Fig. 1. Probability of correct classification versus channel SNR. Solid
lines: EM based ML estimation, dashed lines: MoM estimation.

Fig. 2. Probability of correct classification versus number of samples.
Solid lines: EM based ML estimation, dashed lines: MoM estimation.